Video signal processing using triplets of pixels

ABSTRACT

A de-interlacing process takes a weighted sum of pixels in a filter aperture to generate a pixel in an output picture, the weighted sum including products of triplets of pixels. Using a training sequence of progressive material, it is possible to calculate the weighting coefficients necessary to minimize the mean square error between the filter output and the desired result.

FIELD OF THE INVENTION

This invention relates to video signal processing and especially to processes of interpolation, particularly spatial interpolation, whether horizontal, vertical or two dimensional. The invention applies in an important example to the process of de-interlacing by which a video frame is derived for each field of an interlaced video signal.

BACKGROUND

A known de-interlacing technique derives the “missing” lines through a weighted sum of neighbouring sample points. The location of the sample points to be employed and the values of the weighting coefficients are chosen to minimise visual artefacts and certain design principles have been established.

Adaptive techniques have emerged by which the characteristics of the de-interlacing filter are changed in the face of, for example, motion.

SUMMARY OF THE INVENTION

It is an object of aspects of the present invention to provide improved video signal processing by which the appearance of visual artefacts on spatial interpolation is further minimised.

It is a further object of one aspect of the present invention to provide improved video signal processing by which a video frame is derived for each field of an interlaced video signal.

Accordingly, the present invention consists in one aspect in a video process wherein a weighted sum of pixels from at least one input picture is taken in a filter aperture to generate a pixel in an output picture, characterised in that the weighted sum includes products of triplets of pixels.

Suitably, a video frame is derived through spatial interpolation from each video field of an interlaced input signal.

In one form of the invention, the weighted sum comprises pixels and products of triplets of pixels.

In another aspect, the present invention consists in a video process of interpolation, wherein adaption is provided between a process of spatial interpolation in which a weighted sum of products of pixels from an input picture is taken in a filter aperture to generate a pixel in an output picture, and a process of temporal interpolation in which a weighted sum of pixels from two or more input pictures is taken in a filter aperture to generate a pixel in an output picture.

In yet another aspect, the present invention consists in video signal processing apparatus for interpolation, comprising an interpolation filter taking a weighted sum of pixels from at least one input picture in a filter aperture, to generate a pixel in an output picture, characterised in that the weighted sum includes products of triplets of pixels.

BRIEF DESCRIPTION OF THE DRAWINGS

This invention will now be described by way of example with reference to the accompanying drawings, in which:

FIG. 1 is a diagram of a de-interlacing circuit according to the present invention;

FIG. 2 is a diagrammatical representation of a four tap third order filter useful in accordance with the present invention;

FIG. 3 is a diagram illustrating a process for designing a filter according to the present invention;

FIG. 4 is a series of diagrams illustrating filter apertures for use in the present invention;

FIG. 5 is a diagram of an interpolating circuit according to one embodiment of the present invention; and

FIG. 6 is a diagram of an interpolating circuit according to a further embodiment of the present invention.

DETAILED DESCRIPTION

In one embodiment of this invention, the aim is to interpolate one field of a video frame from another. This is known as de-interlacing.

There is shown in FIG. 1 a de-interlacing circuit in which an interlaced video signal at input terminal 10 is operated upon to form a progressive signal at output terminal 12. A filter 14 receives one field of a video frame and from it interpolates the other field of the frame. A multiplexer 16 receives these “new” fields, as well as the original fields (appropriately delayed at 18). The output of the multiplexer is a progressive scan video signal.

In a traditional de-interlacing circuit, the filter 14 is linear: each filter tap (derived by appropriate delay elements from the input video) is multiplied by a filter weight and the resulting products are summed to give the filter output. In contrast, the present invention proposes a polynomial non-linear filter. This includes, in addition to the linear terms, the sum of filter coefficients multiplied by products of pixel values, triplets of pixel values, etc. For example, a four tap filter for use in the present invention will contain four filter coefficients which are multiplied by single pixel values, ten filter coefficients which are multiplied by products of pixel values, twenty filter coefficients which are multiplied by triplets of pixel values, etc.
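The counts of four, ten and twenty coefficients quoted above can be checked by enumerating the distinct terms. The following is a minimal sketch, assuming that each unordered selection of taps (with repetition allowed, so that squares and cubes of a single tap are included) receives exactly one coefficient; the function name polynomial_terms is illustrative only.

```python
from itertools import combinations_with_replacement


def polynomial_terms(num_taps, order):
    """Return the tap-index tuples for every distinct term of exactly this order.

    A tuple such as (0, 0, 2) stands for the product tap0 * tap0 * tap2.
    Order within a product is ignored, so (0, 2) and (2, 0) are one term.
    """
    return list(combinations_with_replacement(range(num_taps), order))


# Four taps: count the single pixel terms, the products and the triplets.
for order, name in [(1, "single pixels"), (2, "products"), (3, "triplets")]:
    print(name, len(polynomial_terms(4, order)))
```

For four taps this enumeration gives 4, 10 and 20 terms respectively, which agrees with the figures given in the description.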

In any practical embodiment, the polynomial series must be truncated at some point. A filter truncated at the third order is convenient and there is shown diagrammatically in FIG. 2 a four tap third order filter for use as filter 14 of FIG. 1. The polynomial filter is illustrated graphically as the combination of a linear filter 100, a quadratic filter 102 and a cubic filter 103. The linear filter utilises three delay elements 104 to generate four taps from the input signal. Each tap is multiplied by a coefficient in a respective multiplier 105 and a weighted sum generated in summing device 106. In the quadratic filter 102, similar delay elements 114 provide four taps from the input video signal and ten multipliers 115 generate all ten possible products, again weighted by respective coefficients. A sum is formed in summing device 116. The cubic filter has twenty multipliers 125 operating on the taps from delay elements 124 to generate all possible combinations of triplets of taps and a weighted sum is formed in summing device 126. Although the delay elements 104, 114 and 124 have been shown separately in the three filters, one set of delay elements would usually suffice.

It will be understood that in any practical circuit there are very many ways of embodying the described filter. Typically, a single processing element will receive the four taps and with appropriate multipliers, coefficient stores and one or more summing devices, output directly the sum of the linear, quadratic and cubic filters.
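By way of illustration only, a single processing element of this kind might be modelled in software as follows. This is a sketch under stated assumptions rather than a description of any particular circuit: the function polynomial_filter_output, the dictionary layout of the coefficient store and the numerical values are all hypothetical.

```python
from itertools import combinations_with_replacement
from math import prod


def polynomial_filter_output(taps, coefficients, max_order=3):
    """Sum the linear, quadratic and cubic contributions for one output pixel.

    taps         -- pixel values in the filter aperture
    coefficients -- hypothetical coefficient store mapping a tuple of tap
                    indices, e.g. (1, 1, 2), to its weight; missing terms
                    are treated as having zero weight
    """
    total = 0.0
    for order in range(1, max_order + 1):
        for term in combinations_with_replacement(range(len(taps)), order):
            weight = coefficients.get(term, 0.0)
            total += weight * prod(taps[i] for i in term)
    return total


# Illustrative values only: two linear terms, one product, one triplet.
taps = [1.0, 2.0, 3.0, 4.0]
coefficients = {(1,): 0.5, (2,): 0.5, (0, 3): 0.1, (1, 1, 2): -0.01}
result = polynomial_filter_output(taps, coefficients)
print(result)
```

The single loop over all terms corresponds to sharing one set of delay elements between the three filters, with one summing device accumulating every weighted product.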

To understand the technique of constructing a filter according to the present invention, it is helpful to look at FIG. 3. The object is to design an N point digital finite impulse response filter, h, to modify the input, x(n), in such a way as to minimise the mean square error, e(n), between the filter output and the desired signal, y(n). In the case of de-interlacing, x(n) is field f1, and y(n) is field f2. The aim is to create a filter h(n) that, when operated on f1, gives the best possible estimate of f2 such that the mean squared error between the estimate of f2 and actual f2 is minimised. In FIG. 3, a progressive input is provided to block 30 which separates the fields of a video frame and outputs field f1 and field f2. Field f1, that is to say x(n), is provided to the filter 32 to generate an estimate of f2. This is then compared in block 34 with the actual f2, that is to say y(n).
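The arrangement of FIG. 3 can be sketched in software as below. The sketch assumes, purely for illustration, that field f1 holds the even lines of the progressive frame and field f2 the odd lines, that the filter is the simple two pixel (0.5/0.5) vertical average, and that the last line of f1 is repeated at the bottom edge; the function names and the toy frame are hypothetical.

```python
def split_fields(frame):
    """Separate a progressive frame (a list of lines) into fields f1 and f2.

    Assumption for this sketch: f1 holds the even lines, f2 the odd lines.
    """
    return frame[0::2], frame[1::2]


def estimate_f2(f1):
    """Estimate each f2 line as the average of the f1 lines above and below."""
    estimate = []
    for k in range(len(f1)):
        above = f1[k]
        # Repeat the last f1 line where no line exists below (edge assumption).
        below = f1[k + 1] if k + 1 < len(f1) else f1[k]
        estimate.append([0.5 * a + 0.5 * b for a, b in zip(above, below)])
    return estimate


def mean_squared_error(estimate, actual):
    """Mean of the squared pixel differences between two fields."""
    errors = [
        (e - a) ** 2
        for est_line, act_line in zip(estimate, actual)
        for e, a in zip(est_line, act_line)
    ]
    return sum(errors) / len(errors)


# Toy 6 line by 4 pixel frame forming a vertical ramp (illustrative data only).
frame = [[10 * r + c for c in range(4)] for r in range(6)]
f1, f2 = split_fields(frame)
mse = mean_squared_error(estimate_f2(f1), f2)
print(mse)
```

On this ramp the averaging filter is exact for the interior lines, so the whole of the error arises from the repeated line at the bottom edge.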

The filter impulse response which minimises the sum of the squared errors of data of length L is given by the solution of the over-determined (assuming L>N) system of equations

$Xh = y$

where

$X = \begin{bmatrix} x(L) & x(L-1) & \cdots & x(L-N+1) \\ x(L-1) & x(L-2) & \cdots & x(L-N) \\ \vdots & \vdots & & \vdots \\ x(2) & x(1) & \cdots & 0 \\ x(1) & 0 & \cdots & 0 \end{bmatrix}$ and $y = \begin{bmatrix} y(L) \\ y(L-1) \\ \vdots \\ y(2) \\ y(1) \end{bmatrix},$

the least squares solution of which is

$h = (X^{T}X)^{-1}X^{T}y,$

where $X^{T}X = R$ is known as the auto-correlation matrix and $X^{T}y = p$ is known as the cross correlation vector. Note that $X^{T}X$ and $X^{T}y$ are usually much smaller than X. Hence, it is much more efficient to compute $X^{T}X$ and $X^{T}y$ directly from x(n) and y(n) rather than to form X.
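The direct computation of R and p can be sketched as a running accumulation, one sample at a time, so that X itself is never stored. The following is a minimal sketch for the linear case using only the Python standard library; the function names, the small Gaussian elimination solver and the test signal are assumptions made for the purpose of illustration.

```python
def accumulate_normal_equations(x, y, num_taps):
    """Accumulate R = X^T X and p = X^T y sample by sample, without storing X."""
    R = [[0.0] * num_taps for _ in range(num_taps)]
    p = [0.0] * num_taps
    for n in range(len(x)):
        # Row of X for sample n: x(n), x(n-1), ..., zero before the data starts.
        row = [x[n - k] if n - k >= 0 else 0.0 for k in range(num_taps)]
        for i in range(num_taps):
            p[i] += row[i] * y[n]
            for j in range(num_taps):
                R[i][j] += row[i] * row[j]
    return R, p


def solve(R, p):
    """Solve R h = p by Gaussian elimination with partial pivoting."""
    n = len(p)
    A = [R[i][:] + [p[i]] for i in range(n)]  # augmented matrix [R | p]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            for c in range(col, n + 1):
                A[r][c] -= factor * A[col][c]
    h = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(A[i][j] * h[j] for j in range(i + 1, n))
        h[i] = (A[i][n] - tail) / A[i][i]
    return h


# Illustrative training data: y is produced from x by a known 3 tap filter.
x = [3.0, -1.0, 4.0, 1.0, -5.0, 9.0, 2.0, -6.0, 5.0, 3.0]
true_h = [0.5, 0.25, -0.125]
y = [
    sum(true_h[k] * (x[n - k] if n - k >= 0 else 0.0) for k in range(3))
    for n in range(len(x))
]
R, p = accumulate_normal_equations(x, y, 3)
h = solve(R, p)
print(h)
```

Whatever the length L of the training data, the storage required is only the N by N matrix R and the N element vector p, which is the efficiency referred to above.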

The extension of this to a more general non-linear model is in principle simply a matter of modifying the data matrix X. Below we show the data matrix for a second order polynomial non-linear filter, in which a constant (DC) term has also been included. A symmetric form for the non-linear components of the filter has been assumed so, counting the column for the DC term, this matrix has dimension $L \times \left( 1 + N + \frac{N(N+1)}{2} \right).$

$X = \left[ \begin{array}{cccccccccccc} 1 & x(L) & x(L-1) & \cdots & x(L-N+1) & x(L)^{2} & x(L)x(L-1) & \cdots & x(L)x(L-N+1) & x(L-1)^{2} & \cdots & x(L-N+1)^{2} \\ 1 & x(L-1) & x(L-2) & \cdots & x(L-N) & x(L-1)^{2} & x(L-1)x(L-2) & \cdots & x(L-1)x(L-N) & x(L-2)^{2} & \cdots & x(L-N)^{2} \\ 1 & x(L-2) & x(L-3) & \cdots & x(L-N-1) & x(L-2)^{2} & x(L-2)x(L-3) & \cdots & x(L-2)x(L-N-1) & x(L-3)^{2} & \cdots & x(L-N-1)^{2} \\ \vdots & \vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots & \vdots & & \vdots \\ 1 & x(3) & x(2) & \cdots & 0 & x(3)^{2} & x(3)x(2) & \cdots & 0 & x(2)^{2} & \cdots & 0 \\ 1 & x(2) & x(1) & \cdots & 0 & x(2)^{2} & x(2)x(1) & \cdots & 0 & x(1)^{2} & \cdots & 0 \\ 1 & x(1) & 0 & \cdots & 0 & x(1)^{2} & 0 & \cdots & 0 & 0 & \cdots & 0 \end{array} \right]$

The optimal filter, in the least squares sense, can then be estimated by solving h=R⁻¹p. The filter will contain three separate components: the DC term, the standard linear coefficients which should be multiplied by single pixel values, and the quadratic coefficients which will be multiplied by products of pixel values.
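One row of the second order data matrix can be formed from the taps in the aperture as sketched below. The column order (DC term, linear terms, then the symmetric quadratic terms) follows the matrix shown above; the function name second_order_row and the tap values are illustrative assumptions.

```python
from itertools import combinations_with_replacement


def second_order_row(taps):
    """Build one row of X: DC term, linear terms, then quadratic terms.

    The symmetric form is assumed, so each unordered pair of taps
    (including a tap paired with itself) contributes a single column.
    """
    row = [1.0]          # constant (DC) term
    row.extend(taps)     # linear terms
    pairs = combinations_with_replacement(range(len(taps)), 2)
    row.extend(taps[i] * taps[j] for i, j in pairs)  # quadratic terms
    return row


# Three taps give 1 + 3 + 6 = 10 columns.
row = second_order_row([2.0, 3.0, 5.0])
print(row)
```

Rows formed in this way can be accumulated into R and p in exactly the same manner as for the linear filter, the only change being the length of each row.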

The present invention recognises that if the mean square error is chosen for optimisation of the filter, it is possible to calculate the filter coefficients h without forming a trial filter and iterating. The training process then represents not an iterative improvement in a trial or prototype filter, but the collection of sufficient data from real picture material for which both x and y are known, to enable calculation of a meaningful auto-correlation matrix and cross correlation vector.

A polynomial model truncated at the third order is preferred according to this invention. This will contain linear, quadratic, and cubic filters and so is able to model systems which contain both quadratic and cubic non-linear elements. These generate both skewed and symmetric distortions of the probability density function. Higher order models can be used and are shown to give improved results but the size of the filter and the computation required in its estimation rise exponentially and there are rapidly diminishing returns. For example, the fifth order, six pixel non-linear filter does perform better than the third order, six pixel filter but there are over five times as many terms.

For the linear case, it is found that neither increasing the number of taps in the vertical direction of a six point vertical filter, nor utilising pixels in the horizontal direction, significantly reduces the mean squared error. However, for a filter according to the present invention, the choice of aperture has much more dramatic results. For example, a two dimensional aperture does give a significant improvement over a one dimensional one. This is thought to be due to the ability of the non-linear filter to deal with sloping edges and lines and utilise gradient information.

However, as can be seen in Table 1, the number of filter coefficients rises exponentially with the number of pixels. Due to computational constraints a sensible maximum size is presently taken for a cubic filter of 20 pixels and for a fifth order filter, 6 pixels.

TABLE 1 Total number of filter coefficients for third and fifth order non-linear filters containing 4, 6, 8, 12 and 20 pixels.

                 Number of filter coefficients
No. of pixels    Third order non-linear filter    Fifth order non-linear filter
4                35                               126
6                84                               462
8                165                              1287
12               445                              6178
20               1770                             53129
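The growth in the number of coefficients can be illustrated with a short calculation. The sketch below assumes that every distinct term of every order up to the truncation order, together with the DC term, is counted exactly once, in which case the total is a binomial coefficient; under that assumption it reproduces the Table 1 entries for 4, 6 and 8 pixels. The function name coefficient_count is hypothetical.

```python
from math import comb


def coefficient_count(num_pixels, order):
    """Count the distinct terms of degree 0 to `order` in `num_pixels` pixels.

    Assumes the symmetric form, one coefficient per unordered product,
    with the constant (DC) term included in the count.
    """
    return comb(num_pixels + order, order)


for pixels in (4, 6, 8):
    print(pixels, coefficient_count(pixels, 3), coefficient_count(pixels, 5))
```

For six pixels the ratio of the fifth order count to the third order count is 462/84, that is 5.5, consistent with the statement that the fifth order filter has over five times as many terms.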

As the number of pixels available is limited it is important to choose the correct shape of aperture. Best results seem to occur from apertures that contain four vertical pixels and then a number of horizontal pixels. The apertures used for the 4, 6, 8 and 20 pixel filters are shown in FIG. 4 (X denotes the pixels used in field, f1, to estimate the pixel denoted by O in field, f2). The use of horizontal information helps to cope with the near horizontal lines and edges that often cause problems due to jagging in de-interlacing.

Table 2 shows the mean squared error between the estimated field and actual field for a particular reference picture, for a series of different filters. It can be seen that in all cases, the non-linear filters perform better than standard linear filters.

TABLE 2 Mean squared errors for various filters used on EBU reference picture “Girl”.

                                          Mean squared error between    Number of
                                          the estimate of the field     coefficients
Filter type                               and the actual field          in filter
2 pixel linear filter (0.5/0.5)           23.15                         2
4 pixel linear filter (optimum)           18.61                         4
8 pixel linear filter (optimum)           18.58                         8
36 pixel linear filter (optimum)          18.55                         36
4 pixel cubic filter (optimum)            16.14                         35
6 pixel cubic filter (optimum)            15.67                         84
6 pixel fifth order filter (optimum)      14.88                         462
8 pixel cubic filter (optimum)            15.21                         165
12 pixel cubic filter (optimum)           14.69                         445
20 pixel cubic filter (optimum)           13.50                         1770

It is found that the non-linear filter produces much smoother edges and curves than its linear counterpart, with reduced jagging.

Finally, the mean square error is given for a series of pictures for a linear, and two non-linear filters (Table 3). It can be seen that in all cases the non-linear filters perform as well as or better than the linear filters.

TABLE 3 Mean squared error for standard EBU pictures

              Error for         Error for        Error for
Picture       4 pixel linear    4 pixel cubic    12 pixel cubic
Blackboard    49                43               43
Boats         73                67               66
Boy           73                64               57
Clown         23                20               20
Girl          19                17               17
Pond          166               153              140
Tree          313               303              303
couple        91                87               86
Kiel          135               128              128
Latin         363               289              234

Non-linear polynomial filters in accordance with the present invention can give dramatically improved performance over conventional linear predictors when used for spatial de-interlacing. Polynomial non-linear filters are generally more complex than their linear equivalents, although only using spatial information reduces the complexity significantly as compared to conventional spatio-temporal filters. The increased performance seems to occur mainly along edges; whereas linear filters often produce jagging on diagonal lines and curves, the non-linear filters described here considerably reduce such artefacts.

A non-linear filter for use in the present invention can be implemented directly as a set of multipliers and an adder as so far described or the same characteristic can be achieved using a lookup table.

A de-interlacing circuit can operate independently or the de-interlacing function can be incorporated within a circuit operating on an interlaced signal for the purposes of standards conversion, upconversion, downconversion, aspect ratio conversion, digital video effects and so on.

Thus, turning to FIG. 5, there is shown a circuit which operates on an interlaced video signal received at input terminal 500 to provide through interpolation an output video signal at terminal 502. This may be an interlace or a progressive signal and may have different numbers of lines per field, different numbers of fields per second and so on, depending upon the specific function of the circuit. One example would be an interlaced output in a different television broadcast standard to the input.

The input signal of FIG. 5 is passed to a polynomial filter 504 that in one example takes the form illustrated symbolically in FIG. 2. The output of filter 504, comprising the “new” fields, passes through a FIFO 506 to a series chain of delay elements 508. The original fields are taken through a delay 514 to a similar FIFO 516 and delay elements 518.

A weighted sum of the filter taps generated by the delay elements 508 and 518 is taken by means of multipliers 520 and summing device 522. The output of the summing device 522 is taken through a FIFO 524, to the output terminal. The coefficients of the multipliers are set through control unit 526, which also serves to control the rates at which data is read into and read out of the FIFO's 506, 516 and 524.

The skilled man will recognize that through appropriate choice of the delay elements and control of the FIFO's and multipliers, a wide variety of interpolation procedures can be conducted.

In another arrangement, the interpolation process is “folded into” the polynomial filter. Thus as shown in FIG. 6, an interpolating circuit has the interlaced video input signal at terminal 600 passing through FIFO 602 to a polynomial filter 604. This may be of the same general form as FIG. 2 but with each of the multipliers receiving its multiplication coefficient dynamically from a control unit 606. The output of the filter 604 passes through a further FIFO 608 with the control unit 606 controlling the rates at which data is read into and read out of the FIFO's 602 and 608.

In still a further modification, by selecting at least some of the delay elements of the filter to be field delays rather than pixel or line delays, a temporal interpolator can be produced. It is known that the performance of a de-interlacer can be improved for still material by employing temporal interpolation. It is then necessary to detect motion and to adapt or switch, on detection of motion, from temporal interpolation. This motion adaption is preferably conducted substantially pixel by pixel. With prior art techniques, this switching or adaptation produces adaption artefacts that can be visually disturbing. It is found that by using a spatial interpolator according to the present invention, and preferably also a temporal interpolator using a similar polynomial filter, the visibility of adaption artefacts is considerably reduced. It is believed that the described non-linear behaviour of an interpolator according to the present invention provides a “fine” adaption, inasmuch as the value of a pixel in a product of two pixels can be regarded as varying the multiplication coefficient applied to the other pixel. Adaption in the conventional sense from temporal to spatial interpolation can in this sense be regarded as “coarse” adaption. Taking numerals as an illustration, coarse adaption might be regarded as switching from +5 to −5, which is a step large enough to produce switching artefacts. Consider now that the two values of +5 and −5 are both subject to fine adaption in the range 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 in the case of the +5 value, and −9, −8, −7, −6, −5, −4, −3, −2, −1, 0 in the case of the −5 value. Now, in the face of a tendency dictating a switch from +5 to −5, it is to be expected that fine adaption will have occurred in the +5 value towards 0, thus minimising the switch step. If the −5 value has similarly undergone fine adaption towards 0, the step will be further reduced.
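The notion of “fine” adaption can be written out for a single second order term. In the following illustrative identity the symbols are assumptions introduced only for this purpose: $x_1$ and $x_2$ are two pixels in the aperture, $h_{12}$ is the fixed coefficient applied to their product, and $h_{\mathrm{eff}}$ is the resulting effective coefficient on $x_1$.

```latex
h_{12}\, x_1 x_2 \;=\; \bigl( h_{12}\, x_2 \bigr)\, x_1 \;=\; h_{\mathrm{eff}}(x_2)\, x_1,
\qquad h_{\mathrm{eff}}(x_2) = h_{12}\, x_2 .
```

Although $h_{12}$ itself is fixed, the weight effectively applied to $x_1$ follows the picture content through $x_2$, which is the sense in which one pixel of a product varies the multiplication coefficient applied to the other.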

Whilst an important example, de-interlacing is not the only application for apparatus according to the present invention. It may more generally be regarded as useful with an input video signal which is undersampled, de-interlacing being then only one example.

What is claimed is:
1. A video process comprising the steps of taking a weighted sum of pixels from at least one input picture in a filter aperture, and using said weighted sum to generate a pixel in an output picture, characterised in that the weighted sum includes products of triplets of pixels, said products of triplets of pixels comprising the multiplicative product of three pixels multiplied by each other.
2. A video process according to claim 1, in which spatial interpolation is conducted.
3. A video process according to claim 2, in which a video frame is derived through spatial interpolation from each video field of an interlaced input signal.
4. A video process according to claim 1, wherein the weighted sum comprises pixels and products of triplets of pixels.
5. A video process of interpolation comprising a process step of spatial interpolation in which a weighted sum of products of pixels multiplied by each other from an input picture is taken in a filter aperture to generate a pixel in an output picture, an adaption step of switching between spatial interpolation and temporal interpolation, and a process step of temporal interpolation in which a weighted sum of pixels from two or more input pictures is taken in a filter aperture to generate a pixel in an output picture.
6. A video process according to claim 5, wherein said spatial interpolation takes a weighted sum of products of triplets of pixels from said input picture.
7. A video process according to claim 5, wherein said temporal interpolation takes a weighted sum of products of pixels from two or more input pictures.
8. A video process according to claim 7, wherein said temporal interpolation takes a weighted sum of products of triplets of pixels from two or more input pictures.
9. A video process according to claim 5, in which said adaption step is performed in response to motion.
10. A video process according to claim 9, wherein said adaption step is performed substantially pixel by pixel.
11. Video signal processing apparatus for interpolation, comprising an interpolation filter taking a weighted sum of pixels from at least one input picture in a filter aperture, to generate a pixel in an output picture, characterised in that the weighted sum includes products of triplets of pixels, said products of triplets of pixels comprising the multiplicative product of three pixels multiplied by each other.
12. Video signal processing apparatus according to claim 11, wherein said triplets of pixels are from the same picture.
13. Video signal processing apparatus according to claim 11, wherein the filter is continuously in circuit, during operation of the apparatus.
14. A video process of interpolation, having training and interpolating modes, comprising the steps of: in the training mode, inputting into a video signal processing apparatus having a weighted filter, an undersampled picture from which a known desired picture is to be interpolated, and optimising the filter weightings of the weighted filter to minimise an error between said known picture and an output of the video signal processing apparatus; and in an interpolating mode, operating the filter with optimised parameters on an input signal.
15. A process according to claim 14, wherein the error that is minimised is a mean square error.
16. An interpolating filter employing weighting coefficients h, operating on an undersampled video signal x, there being correctly sampled information available for at least a training sequence of x to generate the desired result y of an interpolation process on x, the filter taking weighted sums of products of N pixels multiplied by each other in a filter aperture; the coefficients h employed in the weighting being derived according to h=(X^(T)X)⁻¹X^(T)y where X is the matrix of N pixels of the signal x over the training sequence.
17. A filter according to claim 16, wherein the matrix X includes products of pairs of pixels.
18. A filter according to claim 16, wherein the matrix X includes products of triplets of pixels.
19. A method of processing video information, the method comprising: spatially interpolating an input picture taken in a filter aperture by multiplying a weighted sum of products of pixels by each other to generate a pixel in an output picture; switching between spatial interpolation and temporal interpolation; and temporally interpolating a weighted sum of pixels from two or more input pictures taken in a filter aperture to generate a pixel in an output picture.